    Transforming Graph Representations for Statistical Relational Learning

    Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.

    Meta-Prediction for Collective Classification

    When data instances are inter-related, as are nodes in a social network or hyperlink graph, algorithms for collective classification (CC) can significantly improve accuracy. Recently, an algorithm for CC named Cautious ICA (ICAC) was shown to improve accuracy compared to the popular ICA algorithm. ICAC improves performance by initially favoring its more confident predictions during collective inference. In this paper, we introduce ICAMC, a new algorithm that outperforms ICAC when the attributes that describe each node are not highly predictive. ICAMC learns a meta-classifier that identifies which node label predictions are most likely to be correct. We show that this approach significantly increases accuracy on a range of real and synthetic data sets. We also describe new features for the meta-classifier and demonstrate that a simple search can identify an effective feature set that increases accuracy.

    Boreal forest soil carbon fluxes one year after a wildfire: Effects of burn severity and management

    The extreme 2018 hot drought that affected central and northern Europe led to the worst wildfire season in Sweden in over a century. The Ljusdal fire complex, the largest area burnt that year (8995 ha), offered a rare opportunity to quantify the combined impacts of wildfire and post-fire management on Scandinavian boreal forests. We present chamber measurements of soil CO2 and CH4 fluxes, soil microclimate and nutrient content from five Pinus sylvestris sites for the first growing season after the fire. We analysed the effects of three factors on forest soils: burn severity, salvage-logging and stand age. None of these caused significant differences in soil CH4 uptake. Soil respiration, however, declined significantly after a high-severity fire (complete tree mortality) but not after a low-severity fire (no tree mortality), despite substantial losses of the organic layer. Tree root respiration is thus key in determining post-fire soil CO2 emissions and may benefit, along with heterotrophic respiration, from the nutrient pulse after a low-severity fire. Salvage-logging after a high-severity fire had no significant effects on soil carbon fluxes, microclimate or nutrient content compared with leaving the dead trees standing, although differences are expected to emerge in the long term. In contrast, the impact of stand age was substantial: a young burnt stand experienced more extreme microclimate, lower soil nutrient supply and significantly lower soil respiration than a mature burnt stand, due to a thinner organic layer and the decade-long effects of a previous clear-cut and soil scarification. Disturbance history and burn severity are, therefore, important factors for predicting changes in the boreal forest carbon sink after wildfires. The presented short-term effects and ongoing monitoring will provide essential information for sustainable management strategies in response to the increasing risk of wildfire.

    Cautious Inference in Collective Classification

    Collective classification can significantly improve accuracy by exploiting relationships among instances. Although several collective inference procedures have been reported, they have not been thoroughly evaluated for their commonalities and differences. We introduce novel generalizations of three existing algorithms that allow such algorithmic and empirical comparisons. Our generalizations permit us to examine how cautiously or aggressively each algorithm exploits intermediate relational data, which can be noisy. We conjecture that cautious approaches that identify and preferentially exploit the more reliable intermediate data should outperform aggressive approaches. We explain why caution is useful and introduce three parameters to control the degree of caution. An empirical evaluation of collective classification algorithms, using two base classifiers on three data sets, supports our conjecture.